Control Task for Reinforcement Learning with Known Optimal Solution for Discrete and Continuous Actions
Authors
Abstract
Most research in Reinforcement Learning (RL) concentrates on discrete action sets, but for certain real-world problems it is important to have methods that can find good strategies using actions drawn from continuous sets. This paper describes a simple control task called direction finder, together with its known optimal solution for both discrete and continuous actions. The task allows RL solution methods to be compared on the basis of their value functions. To solve the control task for continuous actions, a simple idea for generalising over them by means of feature vectors is presented. The resulting algorithm is applied with different choices of feature calculation, and a simple measure is introduced for comparing their performance.
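The abstract's idea of generalising over continuous actions with feature vectors can be illustrated with a minimal sketch. The paper does not specify its feature calculations, so the Gaussian radial-basis features, the linear Q-approximation, and the grid-based action maximisation below are all illustrative assumptions, not the authors' method:

```python
import numpy as np

def action_features(a, centers, width=0.2):
    """Gaussian radial-basis features over a 1-D continuous action a.

    The centers and width are illustrative choices; the paper compares
    several feature calculations that are not reproduced here.
    """
    return np.exp(-((a - centers) ** 2) / (2 * width ** 2))

def q_value(state_features, a, w, centers):
    """Linear Q-approximation over the outer product of state and
    action features, Q(s, a) = sum_ij w_ij * phi_s_i * phi_a_j."""
    phi = np.outer(state_features, action_features(a, centers))
    return float(np.sum(w * phi))

# Illustrative setup: 3 state features, 5 action-feature centers in [-1, 1]
rng = np.random.default_rng(0)
centers = np.linspace(-1.0, 1.0, 5)
w = rng.normal(size=(3, 5))
s = np.array([1.0, 0.5, -0.2])

# Greedy continuous action found by evaluating a fine grid -- a simple
# alternative to gradient-based maximisation over the action space
grid = np.linspace(-1.0, 1.0, 201)
best = grid[int(np.argmax([q_value(s, a, w, centers) for a in grid]))]
```

Because the feature map is smooth in `a`, experience with one action generalises to nearby actions, which is the point of using feature vectors instead of a discrete action table.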
Similar Resources
Model-Based Reinforcement Learning with Continuous States and Actions
Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the tran...
Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics
In this paper, the optimal adaptive leader-follower consensus of linear continuous time multi-agent systems is considered. The error dynamics of each player depends on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...
Learning Continuous Action Models in a Real-Time Strategy Environment
Although several researchers have integrated methods for reinforcement learning (RL) with case-based reasoning (CBR) to model continuous action spaces, existing integrations typically employ discrete approximations of these models. This limits the set of actions that can be modeled, and may lead to non-optimal solutions. We introduce the Continuous Action and State Space Learner (CASSL), an int...
Reinforcement Learning C3.3 Delayed reinforcement learning
See the abstract for Chapter C3. Delayed reinforcement learning (RL) concerns the solution of stochastic optimal control problems. In this section we formulate and discuss the basics of such problems. Solution methods for delayed RL will be presented in Sections C3.4 and C3.5. In these three sections we will mainly consider problems in which C3.4, C3.5 the state and control spaces are finite se...
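The stochastic optimal control problems this snippet refers to, over finite state and control spaces, can be made concrete with a tiny value-iteration example. The two-state MDP below is entirely made up for illustration; it is not from the cited chapter:

```python
import numpy as np

# Toy finite MDP (2 states, 2 actions): transition probabilities
# P[s, a, s'] and immediate rewards R[s, a]
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount factor for delayed rewards

# Value iteration: repeatedly apply the Bellman optimality operator
# V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s, a, s') V(s') ]
V = np.zeros(2)
for _ in range(500):
    V = np.max(R + gamma * P @ V, axis=1)
```

At the fixed point, V satisfies the Bellman optimality equation, and the greedy policy with respect to V is optimal for this toy problem.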
Investigating Reinforcement Learning Agents for Continuous State Space Environments
Given an environment with continuous state spaces and discrete actions, we investigate using a Double Deep Q-learning Reinforcement Agent to find optimal policies using the LunarLander-v2 OpenAI gym environment.
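The core of Double Deep Q-learning mentioned in this snippet is the decoupled target: the online network selects the next action and the target network evaluates it. A minimal sketch of that target computation, with made-up Q-values for LunarLander-v2's four discrete actions:

```python
import numpy as np

def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target: the online network picks the greedy
    action, the target network supplies its value. This reduces the
    overestimation bias of standard Q-learning targets."""
    if done:
        return reward
    a_star = int(np.argmax(next_q_online))  # action selection: online net
    return reward + gamma * next_q_target[a_star]  # evaluation: target net

# LunarLander-v2 has 4 discrete actions; these Q-values are invented
q_online = np.array([0.1, 0.7, 0.3, 0.2])
q_target = np.array([0.2, 0.5, 0.4, 0.1])
t = double_q_target(1.0, q_online, q_target)  # 1.0 + 0.99 * 0.5
```

In a full agent this scalar target would be regressed against the online network's Q-value for the taken action; the networks themselves are omitted here.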
Journal: JILSA
Volume: 1, Issue: -
Pages: -
Published: 2009